FUNCTION AND DISSIPATION IN FINITE STATE AUTOMATA - FROM COMPUTING TO INTELLIGENCE AND BACK
Society has benefited from the technological revolution and the tremendous growth in computing powered by Moore's law. However, we are fast approaching the ultimate physical limits in terms of both device sizes and the associated energy dissipation. It is important to characterize these limits in a physically grounded and implementation-agnostic manner in order to capture the fundamental energy dissipation costs of performing computing operations with classical information in nanoscale quantum systems. It is also necessary to identify and understand the effects of quantum indistinguishability, noise, and device variability on these dissipation limits. Identifying these parameters is crucial to designing more energy-efficient computing systems. In this dissertation, we provide a physical description of finite state automata, an abstract tool commonly used to describe computational operations, under the Referential Approach to physical information theory. We derive the fundamental limits of dissipation associated with a state transition in deterministic and probabilistic finite state automata, and propose efficacy measures that capture how well a particular state transition has been physically realized. We use these dissipation bounds to understand the limits of dissipation during the training and testing phases of feed-forward and recurrent neural networks. This study of dissipation in neural networks provides key hints at how dissipation is fundamentally intertwined with learning in physical systems. These ideas connecting energy dissipation, entropy, and physical information provide the toolkit needed to critically analyze the very foundations of computing and our computational approaches to artificial intelligence. In the second part of this dissertation, we derive the non-equilibrium reliable low-dissipation condition for predictive inference in self-organized systems.
This brings together the central ideas of homeostasis, prediction, and energy efficiency under a single non-equilibrium constraint. The work is further extended to study the relationship between adaptive learning and the reliable high-dissipation conditions, and the exploitation-exploration trade-offs in active agents. Using these results, we discuss the differences between observer-dependent and observer-independent computing, and propose a novel descriptive framework for intelligence in physical systems grounded in thermodynamics. This framework, called thermodynamic intelligence, is used to guide the engineering methodologies (devices and architectures) required to implement these descriptions.
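The dissipation bounds above generalize Landauer's principle to automaton state transitions. As a minimal illustration only (a sketch of the classical Landauer bound for a deterministic, many-to-one transition, not the Referential-Approach derivation used in the dissertation), the dissipation floor can be computed from the Shannon entropy lost when the transition merges states:

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

def shannon_entropy(p):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def landauer_bound(p_in, transition, T=300.0):
    """Minimum heat (J) dissipated by a deterministic state transition,
    per Landauer's principle: Q >= k_B * T * ln2 * (H_in - H_out)."""
    # Push the input distribution through the (possibly many-to-one) map.
    p_out = {}
    for state, p in enumerate(p_in):
        s = transition[state]
        p_out[s] = p_out.get(s, 0.0) + p
    dH = shannon_entropy(p_in) - shannon_entropy(p_out.values())
    return k_B * T * math.log(2) * dH

# Two-state automaton that resets both states to state 0 (an erasure):
# uniform input gives H_in = 1 bit, H_out = 0 bits.
q = landauer_bound([0.5, 0.5], {0: 0, 1: 0})
print(q)  # k_B * T * ln 2 at T = 300 K
```

For a logically reversible (one-to-one) transition the entropy difference vanishes and the bound is zero, which is why irreversibility, not switching itself, sets the fundamental cost.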
Physical Information Theoretic Bounds on Energy Costs for Error Correction
With diminishing returns in performance from scaling traditional transistor devices, there is a growing need to understand and improve potential replacement technologies. Sufficient reliability has not been established in these devices, so additional redundancy through fault tolerance and error correction codes is necessary. This additional redundancy carries a price in energy and area. It is therefore important to determine this energy cost and relate it to the increased reliability offered by error correction codes. In this thesis, we determine the lower bound on energy dissipation associated with error correction using a linear (n,k) block code. The bound is implementation-independent, is derived from fundamental considerations, and allows for quantum effects in the channel and decoder. We also develop information-theoretic efficacy measures that quantify the performance of the error correction, and relate them to the corresponding energy cost.
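A rough intuition for why error correction has an irreducible energy cost: correcting a received word erases the channel's error pattern, and erasing those bits is subject to Landauer's principle. The sketch below is only this back-of-envelope argument, not the thesis's implementation-independent bound; the (7,4) code, 1% error rate, and room temperature are illustrative values.

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

def binary_entropy(p):
    """Binary entropy h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def correction_energy_bound(n, p, T=300.0):
    """Heuristic lower bound (J) on heat dissipated to correct one
    n-bit codeword sent over a binary symmetric channel with crossover
    probability p. The error pattern carries about n*h(p) bits of
    entropy, so Landauer gives Q >= k_B * T * ln2 * n * h(p)."""
    return k_B * T * math.log(2) * n * binary_entropy(p)

# (7,4) Hamming code over a 1% binary symmetric channel at 300 K:
e = correction_energy_bound(7, 0.01)
print(e)
```

The bound grows with both block length and channel noise, capturing the trade-off the abstract describes: more redundancy buys reliability, but each corrected error pattern costs energy to erase.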
Multiplexed gradient descent: Fast online training of modern datasets on hardware neural networks without backpropagation
We present multiplexed gradient descent (MGD), a gradient descent framework
designed to easily train analog or digital neural networks in hardware. MGD
utilizes zero-order optimization techniques for online training of hardware
neural networks. We demonstrate its ability to train neural networks on modern
machine learning datasets, including CIFAR-10 and Fashion-MNIST, and compare
its performance to backpropagation. Assuming realistic timescales and hardware
parameters, our results indicate that these optimization techniques can train a
network on emerging hardware platforms orders of magnitude faster than the
wall-clock time of training via backpropagation on a standard GPU, even in the
presence of imperfect weight updates or device-to-device variations in the
hardware. We additionally describe how it can be applied to existing hardware
as part of chip-in-the-loop training, or integrated directly at the hardware
level. Crucially, the MGD framework is highly flexible, and its gradient
descent process can be optimized to compensate for specific hardware
limitations such as slow parameter-update speeds or limited input bandwidth.
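Zero-order methods of the kind MGD builds on estimate the gradient from forward cost evaluations alone, which is what makes them attractive for hardware where backpropagation is impractical. The following is a minimal simultaneous-perturbation (SPSA-style) sketch of one such update, not the authors' MGD implementation; the learning rate, perturbation size, and toy cost function are illustrative choices.

```python
import numpy as np

def zero_order_step(weights, cost_fn, lr=0.01, delta=1e-3, rng=None):
    """One simultaneous-perturbation zero-order update: perturb every
    weight at once, observe the scalar cost difference, and descend
    along the perturbation direction. Only forward evaluations of
    cost_fn are needed -- no backpropagation."""
    rng = rng or np.random.default_rng()
    pert = rng.choice([-1.0, 1.0], size=weights.shape)  # Rademacher dirs
    c_plus = cost_fn(weights + delta * pert)
    c_minus = cost_fn(weights - delta * pert)
    grad_est = (c_plus - c_minus) / (2 * delta) * pert  # unbiased estimate
    return weights - lr * grad_est

# Toy problem: minimize ||w - target||^2 using only cost evaluations.
target = np.array([1.0, -2.0, 0.5])
cost = lambda w: float(np.sum((w - target) ** 2))
w = np.zeros(3)
for _ in range(2000):
    w = zero_order_step(w, cost, lr=0.05)
print(np.round(w, 2))  # approaches target
```

Because each step needs only two cost measurements regardless of the number of weights, the same loop applies unchanged to a hardware network where the cost is a physically measured quantity, which is the setting the abstract describes.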
Thermodynamic Computing
The hardware and software foundations laid in the first half of the 20th
Century enabled the computing technologies that have transformed the world, but
these foundations are now under siege. The current computing paradigm, which is
the foundation of much of the current standards of living that we now enjoy,
faces fundamental limitations that are evident from several perspectives. In
terms of hardware, devices have become so small that we are struggling to
eliminate the effects of thermodynamic fluctuations, which are unavoidable at
the nanometer scale. In terms of software, our ability to imagine and program
effective computational abstractions and implementations is clearly challenged
in complex domains. In terms of systems, currently five percent of the power
generated in the US is used to run computing systems - this astonishing figure
is neither ecologically sustainable nor economically scalable. Economically,
the cost of building next-generation semiconductor fabrication plants has
soared past $10 billion. All of these difficulties - device scaling, software
complexity, adaptability, energy consumption, and fabrication economics -
indicate that the current computing paradigm has matured and that continued
improvements along this path will be limited. If technological progress is to
continue and corresponding social and economic benefits are to continue to
accrue, computing must become much more capable, energy efficient, and
affordable. We propose that progress in computing can continue under a united,
physically grounded, computational paradigm centered on thermodynamics. Herein
we propose a research agenda to extend these thermodynamic foundations into
complex, non-equilibrium, self-organizing systems and apply them holistically
to future computing systems that will harness nature's innate computational
capacity. We call this type of computing "Thermodynamic Computing" or TC.

Comment: A Computing Community Consortium (CCC) workshop report, 36 pages